Dense Subgraphs with Restrictions and Applications to Gene Annotation Graphs

Authors

Barna Saha

Allison Hoch

Samir Khuller

Louiqa Raschid

Xiao-Ning Zhang

Abstract

In this paper, we focus on finding complex annotation patterns representing novel and interesting hypotheses from gene annotation data. We define a generalization of the densest subgraph problem by adding a distance restriction (defined by a separate metric) to the nodes of the subgraph. We show that while this generalization makes the problem NP-hard for arbitrary metrics, the problem can be solved optimally in polynomial time when the metric comes from the distance metric of a tree or an interval graph. We also show that the densest subgraph problem with a specified subset of vertices that must be included in the solution can be solved optimally in polynomial time. In addition, we consider extensions in which, rather than finding a single solution, we wish to list all subgraphs of almost maximum density. We apply this method to a dataset of genes and their annotations obtained from The Arabidopsis Information Resource (TAIR). A user evaluation confirms that the patterns found in the distance-restricted densest subgraph for a dataset of photomorphogenesis genes are indeed validated in the literature; a control dataset validates that these are not random patterns. Interestingly, the complex annotation patterns potentially lead to new and as yet unknown hypotheses. We perform experiments to determine the properties of the dense subgraphs as we vary parameters, including the number of genes and the distance.
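To make the problem statement concrete, the following is a minimal brute-force sketch. It is not the paper's polynomial-time algorithm. It assumes that density means the number of induced edges divided by the number of nodes, and that the distance restriction requires every pair of chosen nodes to be within a bound `max_dist` under a separate metric `dist`. The function names, the `required` parameter (modelling the "specified subset of vertices" variant), and the toy graph are all hypothetical and chosen only for illustration.

```python
from itertools import combinations


def density(nodes, edges):
    """Number of edges induced by `nodes` divided by the number of nodes."""
    chosen = set(nodes)
    induced = sum(1 for u, v in edges if u in chosen and v in chosen)
    return induced / len(chosen)


def densest_subgraph_bruteforce(nodes, edges, dist=None, max_dist=None, required=()):
    """Exhaustive search (exponential time) for the densest feasible node set.

    Assumptions for illustration only:
      - if `dist` is given, every pair of chosen nodes must satisfy
        dist(u, v) <= max_dist;
      - every node in `required` must appear in the solution.
    """
    required = set(required)
    best, best_density = None, -1.0
    for k in range(1, len(nodes) + 1):
        for subset in combinations(nodes, k):
            if not required.issubset(subset):
                continue
            if dist is not None and any(
                dist(u, v) > max_dist for u, v in combinations(subset, 2)
            ):
                continue
            d = density(subset, edges)
            if d > best_density:  # ties keep the first subset found
                best, best_density = set(subset), d
    return best, best_density


# Toy graph: a 4-clique on {0, 1, 2, 3} plus a pendant edge (3, 4).
nodes = [0, 1, 2, 3, 4]
edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3), (3, 4)]


def line_dist(u, v):
    """Hypothetical separate metric: nodes sit at integer points on a line."""
    return abs(u - v)


print(densest_subgraph_bruteforce(nodes, edges))
print(densest_subgraph_bruteforce(nodes, edges, dist=line_dist, max_dist=2))
print(densest_subgraph_bruteforce(nodes, edges, required=[4]))
```

On this toy input the unrestricted optimum is the 4-clique, while the distance bound forces a smaller and sparser solution, and requiring node 4 pulls in the whole graph. Exhaustive search is only practical for a handful of nodes, which is why the polynomial-time results for tree and interval-graph metrics matter.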


Similar resources

HiDDen: Hierarchical Dense Subgraph Detection with Application to Financial Fraud Detection

Dense subgraphs are fundamental patterns in graphs, and dense subgraph detection is often the key step of numerous graph mining applications. Most of the existing methods aim to find a single subgraph with a high density. However, dense subgraphs at different granularities could reveal more intriguing patterns in the underlying graph. In this paper, we propose to hierarchically detect dense sub...


Mining Density Contrast Subgraphs

Dense subgraph discovery is a key primitive in many graph mining applications, such as detecting communities in social networks and mining gene correlation from biological data. Most studies on dense subgraph mining only deal with one graph. However, in many applications, we have more than one graph describing relations among a same group of entities. In this paper, given two graphs sharing the...


Discovery of Top-k Dense Subgraphs in Dynamic Graph Collections

Dense subgraph discovery is a key issue in graph mining, due to its importance in several applications, such as correlation analysis, community discovery in the Web, gene co-expression and protein-protein interactions in bioinformatics. In this work, we study the discovery of the top-k dense subgraphs in a set of graphs. After the investigation of the problem in its static case, we extend the m...


Discovering Large Dense Subgraphs in Massive Graphs

We present a new algorithm for finding large, dense subgraphs in massive graphs. Our algorithm is based on a recursive application of fingerprinting via shingles, and is extremely efficient, capable of handling graphs with tens of billions of edges on a single machine with modest resources. We apply our algorithm to characterize the large, dense subgraphs of a graph showing connections between ...


The distinguishing chromatic number of bipartite graphs of girth at least six

The distinguishing number $D(G)$ of a graph $G$ is the least integer $d$ such that $G$ has a vertex labeling with $d$ labels that is preserved only by a trivial automorphism. The distinguishing chromatic number $\chi_{D}(G)$ of $G$ is defined similarly, where, in addition, $f$ is assumed to be a proper labeling. We prove that if $G$ is a bipartite graph of girth at least six with the maximum ...




Publication date: 2010
